HP2PC: Scalable Hierarchically-Distributed Peer-to-Peer Clustering
Abstract
In distributed data mining, adopting a flat node distribution model can limit scalability. To address modularity, flexibility, and scalability, we propose a hierarchically-distributed peer-to-peer architecture and algorithm for data clustering (HP2PC). The architecture is based on a multi-layer overlay network of peer neighborhoods. Supernodes, which act as representatives of neighborhoods, are recursively grouped to form higher-level neighborhoods. Peers at a given level of the hierarchy cooperate within their respective neighborhoods to perform clustering. Using this model, we can partition the clustering problem in a modular way, solve each part individually, and then successively combine the clusterings up the hierarchy, where increasingly global solutions are computed. The algorithm was applied to a distributed document clustering problem and achieved a decent speedup, with clustering quality comparable to that of the centralized approach.
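The bottom-up combination described in the abstract can be illustrated with a small sketch. This is not the authors' HP2PC algorithm: it assumes each peer summarizes its local data with plain k-means, and that each supernode merges its neighborhood's summaries by running weighted k-means over the received centroids. The names `weighted_kmeans`, `cluster_hierarchy`, and the nested-list encoding of the hierarchy are hypothetical, chosen only for illustration, and every peer is assumed to hold at least one point.

```python
from typing import List, Tuple

Point = Tuple[float, ...]
Summary = List[Tuple[Point, float]]  # (centroid, weight) pairs


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(items: Summary, k: int, iters: int = 25) -> Summary:
    """Lloyd's algorithm over weighted points; returns (centroid, weight) pairs."""
    pts = sorted(items)
    k = min(k, len(pts))
    # Deterministic initialisation: evenly spaced picks from the sorted points.
    step = len(pts) / k
    centroids = [pts[int(i * step)][0] for i in range(k)]
    dim = len(centroids[0])
    mass = [0.0] * k
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        mass = [0.0] * k
        for p, w in pts:
            # Assign each weighted point to its nearest current centroid.
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            mass[j] += w
            for d, x in enumerate(p):
                sums[j][d] += w * x
        # Move each centroid to the weighted mean of its assigned points.
        centroids = [
            tuple(x / mass[j] for x in sums[j]) if mass[j] > 0 else centroids[j]
            for j in range(k)
        ]
    # Drop clusters that ended up empty.
    return [(c, m) for c, m in zip(centroids, mass) if m > 0]


def cluster_hierarchy(node: list, k: int) -> Summary:
    """Cluster a hierarchy bottom-up.

    A peer is a list of points (tuples); a neighborhood is a list of child
    nodes whose summaries its supernode merges (an assumed encoding).
    """
    if isinstance(node[0], tuple):  # leaf: a peer holding raw data
        return weighted_kmeans([(p, 1.0) for p in node], k)
    merged: Summary = []
    for child in node:  # supernode: collect the children's summaries
        merged.extend(cluster_hierarchy(child, k))
    return weighted_kmeans(merged, k)


# Four peers in two neighborhoods, combined at a root level.
peer1 = [(0.0,), (1.0,), (10.0,), (11.0,)]
peer2 = [(0.5,), (10.5,)]
peer3 = [(0.0,), (10.0,)]
peer4 = [(1.0,), (11.0,), (0.5,)]
hierarchy = [[peer1, peer2], [peer3, peer4]]

result = cluster_hierarchy(hierarchy, k=2)
print(result)
```

Only centroids and their weights travel up the hierarchy in this sketch, so each level works on a summary much smaller than the raw data beneath it. That is the modularity the abstract points to, though the actual exchange and merging rules in HP2PC may differ.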
Similar resources
P2P Network Trust Management Survey
Peer-to-peer (P2P) applications are no longer limited to home users and are starting to be accepted in academic and corporate environments. While file sharing and instant messaging applications are the most traditional examples, they are no longer the only ones benefiting from the potential advantages of P2P networks. For example, network file storage, data transmission, distributed computing, and co...
Multi-objective optimization based privacy preserving distributed data mining in Peer-to-Peer networks
This paper proposes a scalable, local privacy-preserving algorithm for distributed peer-to-peer (P2P) data aggregation useful for many advanced data mining/analysis tasks such as average/sum computation, decision tree induction, feature selection, and more. Unlike most multi-party privacy-preserving data mining algorithms, this approach works in an asynchronous manner through local interactions...
Connectivity Based Node Clustering in Decentralized Peer-to-Peer Networks
Connectivity based node clustering has wide ranging applications in decentralized Peer-to-Peer (P2P) networks such as P2P file sharing systems, mobile ad-hoc networks, P2P sensor networks and so forth. This paper describes a Connectivity-based Distributed Node Clustering scheme (CDC). This scheme presents a scalable and efficient solution for discovering connectivity based clusters in peer n...
A Quorum Based Distributed Mutual Exclusion Algorithm for Multi-Level Clustered Network Architecture
Different permission-based algorithms have been proposed to solve the mutual exclusion problem. With the emergence of peer-to-peer computing, distributed applications are spread over a large number of nodes. Cluster-based solutions scale to a large number of participants, and some algorithms have been proposed using a cluster topology. But the number of participating nodes is increasing e...
A Scalable Semantic Indexing Framework for Peer-to-Peer Information Retrieval
The exponential growth of data demands scalable and adaptable infrastructures for indexing and searching a huge amount of data sources with high accuracy and efficiency. Existing centralized search engines are not scalable and suffer from single-point-of-failures. The recent work on P2P index construction partitions the document vectors either randomly or statically, making it difficult to trade...
Publication date: 2007